Goto

Collaborating Authors

 optimal classification tree


BooleanOCT: Optimal Classification Trees based on multivariate Boolean Rules

Tu, Jiancheng, Fan, Wenqi, Wu, Zhibin

arXiv.org Artificial Intelligence

The global optimization of classification trees has demonstrated considerable promise, notably in enhancing accuracy, optimizing size, and thereby improving human comprehensibility. While existing optimal classification trees substantially enhance accuracy over greedy-based tree models like CART, they still fall short when compared to the more complex black-box models, such as random forests. To bridge this gap, we introduce a new mixed-integer programming (MIP) formulation, grounded in multivariate Boolean rules, to derive the optimal classification tree. Our methodology integrates both linear metrics, including accuracy, balanced accuracy, and cost-sensitive cost, as well as nonlinear metrics such as the F1-score. The approach is implemented in an open-source Python package named BooleanOCT. We comprehensively benchmark these methods on the 36 datasets from the UCI machine learning repository. The proposed models demonstrate practical solvability on real-world datasets, effectively handling sizes in the tens of thousands. Aiming to maximize accuracy, this model achieves an average absolute improvement of 3.1\% and 1.5\% over random forests in small-scale and medium-sized datasets, respectively. Experiments targeting various objectives, including balanced accuracy, cost-sensitive cost, and F1-score, demonstrate the framework's wide applicability and its superiority over contemporary state-of-the-art optimal classification tree methods in small to medium-scale datasets.


ODTlearn: A Package for Learning Optimal Decision Trees for Prediction and Prescription

Vossler, Patrick, Aghaei, Sina, Justin, Nathan, Jo, Nathanael, Gómez, Andrés, Vayanos, Phebe

arXiv.org Machine Learning

ODTLearn is an open-source Python package that provides methods for learning optimal decision trees for high-stakes predictive and prescriptive tasks based on the mixed-integer optimization (MIO) framework proposed in (Aghaei et al., 2019) and several of its extensions. The current version of the package provides implementations for learning optimal classification trees, optimal fair classification trees, optimal classification trees robust to distribution shifts, and optimal prescriptive trees from observational data. We have designed the package to be easy to maintain and extend as new optimal decision tree problem classes, reformulation strategies, and solution algorithms are introduced. To this end, the package follows object-oriented design principles and supports both commercial (Gurobi) and open source (COIN-OR branch and cut) solvers.


Machine Learning NeEDS Mathematical Optimization

#artificialintelligence

Abstract: In recent years there has been growing attention to interpretable machine learning models which can give explanatory insights on their behavior. Thanks to their interpretability, decision trees have been intensively studied for classification tasks, and due to the remarkable advances in mixed-integer programming (MIP), various approaches have been proposed to formulate the problem of training an Optimal Classification Tree (OCT) as a MIP model. We present a novel mixed-integer quadratic formulation for the OCT problem, which exploits the generalization capabilities of Support Vector Machines for binary classification. Our model, denoted as Maximum Margin Optimal Classification Tree (MARGOT), encompasses the use of maximum margin multivariate hyperplanes nested in a binary tree structure. To enhance the interpretability of our approach, we analyse two alternative versions of MARGOT, which include feature selection constraints inducing local sparsity of the hyperplanes.


Robust Optimal Classification Trees under Noisy Labels

Blanco, Víctor, Japón, Alberto, Puerto, Justo

arXiv.org Machine Learning

In this paper we propose a novel methodology to construct Optimal Classification Trees that takes into account that noisy labels may occur in the training sample. Our approach rests on two main elements: (1) the splitting rules for the classification trees are designed to maximize the separation margin between classes applying the paradigm of SVM; and (2) some of the labels of the training sample are allowed to be changed during the construction of the tree trying to detect the label noise. Both features are considered and integrated together to design the resulting Optimal Classification Tree. We present a Mixed Integer Non Linear Programming formulation for the problem, suitable to be solved using any of the available off-the-shelf solvers. The model is analyzed and tested on a battery of standard datasets taken from UCI Machine Learning repository, showing the effectiveness of our approach.


MurTree: Optimal Classification Trees via Dynamic Programming and Search

Demirović, Emir, Lukina, Anna, Hebrard, Emmanuel, Chan, Jeffrey, Bailey, James, Leckie, Christopher, Ramamohanarao, Kotagiri, Stuckey, Peter J.

arXiv.org Artificial Intelligence

Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy, size, and other considerations such as fairness. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes and we argue it can be extended with other requirements. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees.


Interpretable Clustering via Optimal Trees

Bertsimas, Dimitris, Orfanoudaki, Agni, Wiberg, Holly

arXiv.org Machine Learning

State-of-the-art clustering algorithms use heuristics to partition the feature space and provide little insight into the rationale for cluster membership, limiting their interpretability. In healthcare applications, the latter poses a barrier to the adoption of these methods since medical researchers are required to provide detailed explanations of their decisions in order to gain patient trust and limit liability. We present a new unsupervised learning algorithm that leverages Mixed Integer Optimization techniques to generate interpretable tree-based clustering models. Utilizing the flexible framework of Optimal Trees [1], our method approximates the globally optimal solution leading to high quality partitions of the feature space. Our algorithm, can incorporate various internal validation metrics, naturally determines the optimal number of clusters, and is able to account for mixed numeric and categorical data. It achieves comparable or superior performance on both synthetic and real world datasets when compared to K-Means while offering significantly higher interpretability.